Quality Outreach Heads-up - Unicode in JDK 23: Removal of COMPAT Locale Provider

The OpenJDK Quality Group is promoting the testing of FOSS projects with OpenJDK builds as a way to improve the overall quality of the release. This heads-up is part of a regular communication sent to the projects involved. To learn more about the program, and how-to join, please check here.

A Quick History of Locale Data in the JDK

Before the Unicode Consortium created the Common Locale Data Repository (CLDR) in 2003 to manage locale data, the JDK had to provide its own collection. It did so successfully and in JDK 8 supported about 160 locales. To reduce maintenance effort, allow better interoperability between platforms, and improve locale data quality, the JDK started to move towards CLDR in 2014:

  • JDK 8 comes with two locale data providers, which can be selected with the system property java.locale.providers:
    • JRE/COMPAT for the JDK’s legacy data collection (default)
    • CLDR for the CLDR data
    • a custom locale provider can be implemented
  • JDK 9 picks CLDR by default
  • JDK 21 issues a warning on JRE/COMPAT

There are plenty of minor and a few notable differences between the legacy data and CLDR - the recently rewritten JEP 252 lists a few of them.

Locale Data in JDK 23

JDK 23 removes legacy locale data. As a consequence, setting java.locale.providers to JRE or COMPAT has no effect.

Projects that are still using legacy locale data are highly encouraged to switch to CLDR as soon as possible. Where that is infeasible, two alternatives remain:

  • Create custom formatters with patterns that mimic the legacy behavior and use them everywhere where locale-sensitive data is written or parsed.
  • Implement a custom locale data provider.

For more details on that as well as on CLDR in the JDK in general, please check JEP 252. It has been recently rewritten to provide better information and guidance.

~